class: center, middle, inverse, title-slide

# Day 5
## Data wrangling
### Michael W. Kearney📊<br>School of Journalism<br>Informatics Institute<br>University of Missouri
### @kearneymw<br>@mkearney

---
class: inverse, center, middle

## Agenda

---

## Agenda

+ Computing basics
+ Tidyverse
  - Visualizing
  - Wrangling Basics
  - Merging & Joining
  - Transforming & Cleaning

---
class: inverse, center, middle

## Computing basics

---

## File/directory awareness

+ Every computer has a file/directory system
+ There's a default or home folder

```r
## use this to see your home folder
normalizePath("~")
#> [1] "/Users/kearneymw"
```

+ Files are organized like a tree
  - You can move from one branch to another
  - Moving from one folder to another is linear

---

## Bash shortcuts

+ The most popular bash/terminal conventions also show up in R scripts
  - **`..`** to go back a folder or **`../..`** to go back two folders
  - **`~`** as a shortcut for home, e.g., **`~/R/stat`**

---

## Don't fall asleep yet!

Why should you care?

+ Because you need to know where things are to interact with them!
+ Because you want to be able to replicate your work!

> **tl;dr**: Open your file browser.

> **tl;dr**: See how your files are organized?

> **tl;dr**: Yeah? Good!

---

## Don't let these terms confuse you

+ `File` is the **name**/location of a file
+ `Path` is the name/**location** of a file
+ `Folder` is the **name**/location of a file folder
+ `Directory` is the name/**location** of a file folder

<p class="note">file == path & folder == directory</p>

---
class: inverse, center, middle

## Do unto <huge>future you</huge> <br> as you would have future you <br> do unto <huge> you</huge>.

---

## FutuRe you

+ Using scripts and writing clear code makes life easier
+ It's like writing down a point-and-click routine so you don't have to memorize it
+ The internet is full of routines/scripts you can edit and customize to your liking
+ With some notes and extra attention, future you will be very happy with current you!

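---

## Path shortcuts in practice

A minimal sketch of the **`..`** and **`~`** shortcuts from the Bash shortcuts slide, using base R only. The folder names `project` and `data` are made up for illustration, and everything happens inside a temporary folder so nothing on your computer changes.

```r
## build a throwaway folder tree: <temp folder>/project/data
## ("project" and "data" are hypothetical names)
root <- normalizePath(tempdir(), winslash = "/")
start <- file.path(root, "project", "data")
dir.create(start, recursive = TRUE, showWarnings = FALSE)

## ".." goes back one folder; "../.." goes back two
up_one <- normalizePath(file.path(start, ".."), winslash = "/")
up_two <- normalizePath(file.path(start, "../.."), winslash = "/")

basename(up_one)
#> [1] "project"
identical(up_two, root)
#> [1] TRUE

## "~" expands to your home folder (the result differs on every computer)
home <- path.expand("~")
```

Building paths with `file.path()` instead of pasting strings together keeps the separators consistent across operating systems.
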
---
class: inverse, center, middle

## Tidyverse

---

## Cheatsheets

+ Check out these [RStudio cheatsheets](https://www.rstudio.com/resources/cheatsheets/)

<p style="text-align:center">
<img src="img/cheatsheets.png" />
</p>

---

## Notebooks

1. [Visualizing](../tidyverse/01-visualize.Rmd)
1. [Wrangling (basics)](../tidyverse/02-wrangle.Rmd)
1. [Merging & Joining](../tidyverse/03-merge-and-join.Rmd)
1. [Transforming & Cleaning](../tidyverse/04-transform-and-clean.Rmd)

<style>
huge {font-size:1.7em;}
</style>

<!--
## Data wrangling

+ Computing basics
+ Data frames
  - Wrangle: `select()`, `filter()`, `arrange()`, `mutate()`, `transmute()`
  - Bind: `bind_rows()`, `bind_cols()`
  - Join: `left_join()`, `full_join()`, `right_join()`
+ Text
  - Capitalize: `str_to_upper()`, `str_to_lower()`
  - RegEx: `str_replace()`, `str_remove()`, `str_extract()`, `str_detect()`
  - Split: `str_split()`
+ Numbers
  - Numbers (dollar sign, commas, etc.)
  - Dates/Times
-->